BreakingNews: Article Annotation by Image and Text Processing

Abstract

Current approaches at the intersection of computer vision and NLP have achieved unprecedented breakthroughs in tasks such as automatic captioning and image retrieval. Most of these methods, though, rely on training sets of images paired with annotations that specifically describe the visual content. This paper goes a step further and explores more complex cases in which textual descriptions are only loosely related to the images. We focus on the news domain. We introduce new deep learning methods that address source and popularity prediction, article illustration, and article geolocation. We propose an adaptive CNN that shares most of its structure across all tasks and is suitable for multitask and transfer learning. Deep CCA is deployed for article illustration, and a new loss function based on the Great Circle Distance is proposed for geolocation. Furthermore, we present BreakingNews, a novel dataset of approximately 100K news articles that include images, text, and captions and are enriched with heterogeneous metadata. BreakingNews allows all of the aforementioned problems to be explored, and we provide baseline performance for each using various CNN architectures and different representations of the textual and visual features. We report promising results and bring to light several limitations of the current state of the art, which we hope will help spur progress in the field.
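The abstract mentions a geolocation loss based on the Great Circle Distance but does not give its form. As a rough illustration only, the sketch below computes the great-circle distance with the standard haversine formula and averages it over a batch. The function names, the mean Earth radius of 6371 km, and the choice of a plain mean are assumptions of this sketch, not the paper's formulation.

```python
import math

# Assumption of this sketch: mean Earth radius in kilometres.
EARTH_RADIUS_KM = 6371.0


def great_circle_distance_km(lat1, lon1, lat2, lon2):
    """Haversine great-circle distance between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2)
    # Clamp to [0, 1] to guard against floating-point drift before sqrt/asin.
    a = min(1.0, max(0.0, a))
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def mean_gcd_loss(predicted, target):
    """Hypothetical loss: mean great-circle distance over (lat, lon) pairs."""
    if not predicted or len(predicted) != len(target):
        raise ValueError("predicted and target must be non-empty and equal length")
    total = sum(
        great_circle_distance_km(p[0], p[1], t[0], t[1])
        for p, t in zip(predicted, target)
    )
    return total / len(predicted)
```

Calling `mean_gcd_loss([(41.39, 2.17)], [(48.86, 2.35)])` would return the distance in kilometres between a predicted and a true location. Unlike a squared error on raw coordinates, a distance of this kind respects the wrap-around of longitude and the shrinking of longitude degrees toward the poles.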